pySCG: Adding explanation of the 'is' operator to CWE-595 #997
base: main
Signed-off-by: edanhub <[email protected]>
Provided the example01.py code in the md for the observed issues around numbers and strings, to explain the memory-optimization-related issues.
Suggestion for the first sentence:
Prevent unexpected results by knowing what comparison operators such as `==` and `is` actually do.
After the existing first sentence we should also explain:
Python falls back to comparing the objects' `id()` if the `__eq__` implementation is missing for a custom class.
In Python, the `==` operator is implemented by the `__eq__` method on an object [[python.org data model 2023](https://docs.python.org/3/reference/datamodel.html?highlight=__eq__#object.__eq__)]. For built-in types like `int` and `str`, the comparison is implemented in the interpreter. The main issue comes when implementing custom classes, where the default implementation compares object references using the `is` operator. The `is` operator compares the identities of the objects, equivalent to `id(obj1) == id(obj2)`. The `id` function is built into Python, and in the CPython interpreter, the standard implementation, it returns the object's memory address [[de Langen 2023](https://realpython.com/python-is-identity-vs-equality/)].

You want to implement the `__eq__` method on a class if you believe you ever want to compare it to another object or find it in a list of objects. Actually, it is so common that the `dataclasses.dataclass` decorator by default implements it for you [[dataclasses — Data Classes — Python 3.11.4 documentation](https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass)].
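To make the fallback concrete, here is a minimal sketch comparing three ways of defining a class. The class names `PlainPoint`, `ValuePoint`, and `DataPoint` are hypothetical and exist only for this illustration; they are not part of the existing example files.

```python
from dataclasses import dataclass


class PlainPoint:
    """Hypothetical class without __eq__: == falls back to identity."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class ValuePoint:
    """Hypothetical class that defines __eq__ to compare by value."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, ValuePoint):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Defining __eq__ alone sets __hash__ to None, so restore hashing
        # in a way that stays consistent with __eq__.
        return hash((self.x, self.y))


@dataclass
class DataPoint:
    """Hypothetical dataclass: __eq__ is generated by the decorator."""

    x: int
    y: int


plain_equal = PlainPoint(1, 2) == PlainPoint(1, 2)    # identity comparison
value_equal = ValuePoint(1, 2) == ValuePoint(1, 2)    # custom __eq__
data_equal = DataPoint(1, 2) == DataPoint(1, 2)       # generated __eq__
found_in_list = ValuePoint(1, 2) in [ValuePoint(0, 0), ValuePoint(1, 2)]

print(plain_equal, value_equal, data_equal, found_in_list)
# prints: False True True True
```

Returning `NotImplemented` for foreign types lets Python try the other operand's `__eq__` before falling back to identity.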
Be aware of Python's memory optimization for strings and numbers, as demonstrated in the `example01.py` code.
Python tries to avoid allocating more memory for the same string. The integers from `-5` to `256` are used so frequently that they are pre-reserved.
# SPDX-FileCopyrightText: OpenSSF project contributors
# SPDX-License-Identifier: MIT
""" Code Example """
print("-" * 10 + "Memory optimization with strings" + 10 * "-")
a = "foobar"
b = "foobar"
c = ''.join(["foo", "bar"])
print(f"a is b: {a} is {b}?", a is b)
print(f"a is c: {a} is {c}?", a is c)
print(f"a == c: {a} == {c}?", a == c)
print(f"size? len(a)={len(a)} len(b)={len(b)} len(c)={len(c)}")
print("-" * 10 + "Memory optimization with numbers" + 10 * "-")
a = b = 256
print(f"{a} is {b}?", a is b)
a = b = 257
print(f"{a} is {b}?", a is b)
print("-" * 10 + "Memory optimization with numbers in a loop" + 10 * "-")
a = b = 255
# Increment both names in step until they stop referring to the same object
while a is b:
    a += 1
    b += 1
    print(f"{a} is {b}?", a is b)
The `example01.py` code output:
----------Memory optimization with strings----------
a is b: foobar is foobar? True
a is c: foobar is foobar? False
a == c: foobar == foobar? True
size? len(a)=6 len(b)=6 len(c)=6
----------Memory optimization with numbers----------
256 is 256? True
257 is 257? True
----------Memory optimization with numbers in a loop----------
256 is 256? True
257 is 257? False
The first print statements illustrate Python's memory optimization for strings: `id(a)` and `id(b)` refer to the same object, whereas `id(c)` is a different object holding the same string, so `a is c` is `False` even though `a == c` is `True`.
The example in the middle yields the same object because `a = b = 257` binds both names to a single object, so no second allocation takes place.
The last print statements in `example01.py` illustrate Python's memory optimization for numbers between `-5` and `256`. Python needs to allocate new objects for numbers greater than `256`. Note that Python's behavior for numbers outside the range `-5` to `256` changes depending on how it is run. The following code behaves differently when run in an interactive Python shell than when run from a file:
example02.py:
a = 256
b = 256
print(a is b)
a = 257
b = 257
print(a is b)
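An interactive Python shell prints `True` and `False`, while running the code as a Python script prints `True` and `True`, as the runtime has optimized the code to preserve memory: when CPython compiles a whole file at once, equal constants are stored once and shared, whereas the interactive shell compiles each statement separately.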
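The difference between the shell and a script can be reproduced from inside a single script. The sketch below assumes that the deciding factor is whether the two assignments are compiled together or separately; the helper `same_object_after_exec` is a hypothetical name introduced only for this illustration. On CPython the first call typically reports the two names as identical and the second does not, but this is an implementation detail and not a language guarantee, which is why value comparisons should use `==`.

```python
def same_object_after_exec(*sources):
    """Compile each source string separately, run them all in one shared
    namespace, and return (a is b, a == b) for the resulting names."""
    namespace = {}
    for index, source in enumerate(sources):
        code = compile(source, f"<unit-{index}>", "exec")
        exec(code, namespace)
    a, b = namespace["a"], namespace["b"]
    return a is b, a == b


# Both assignments compiled together, as when running a script file.
one_unit = same_object_after_exec("a = 257\nb = 257")

# Each assignment compiled on its own, as in the interactive shell.
two_units = same_object_after_exec("a = 257", "b = 257")

print("one unit  (identical, equal):", one_unit)
print("two units (identical, equal):", two_units)
```

In both cases the second element of the tuple is `True`: equality is stable regardless of how the code was compiled, while identity is not.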
Co-authored-by: myteron <[email protected]> Signed-off-by: Hubert Daniszewski <[email protected]>
Addresses issue #714